A digital twin is defined as a virtual representation of a physical asset enabled through data and simulators for real-time prediction, optimization, monitoring, control, and improved decision-making. Unfortunately, the term remains vague and says little about a twin's capability. Recently, the concept of capability level has been introduced to address this issue. According to this concept, a digital twin can be categorized by capability on a scale from zero to five, with the levels referred to as standalone, descriptive, diagnostic, predictive, prescriptive, and autonomous, respectively. The current work introduces the concept in the context of the built environment and demonstrates it using a modern house as a use case. The house is equipped with an array of sensors that collect time-series data on the internal state of the house. Together with physics-based and data-driven models, these data are used to develop digital twins at different capability levels, demonstrated in virtual reality. In addition to presenting a blueprint for developing digital twins, the work also provides future research directions to enhance the technology.
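The zero-to-five capability scale described above can be written down as a small enumeration. This is only an illustrative sketch: the names `CapabilityLevel` and `describe` are hypothetical and do not come from the paper; only the level numbers and labels are taken from the abstract.

```python
from enum import IntEnum


class CapabilityLevel(IntEnum):
    """Digital twin capability levels, numbered 0-5 as in the abstract."""
    STANDALONE = 0
    DESCRIPTIVE = 1
    DIAGNOSTIC = 2
    PREDICTIVE = 3
    PRESCRIPTIVE = 4
    AUTONOMOUS = 5


def describe(level: int) -> str:
    """Return a readable label for a numeric level (hypothetical helper)."""
    lvl = CapabilityLevel(level)  # raises ValueError for values outside 0-5
    return f"level {lvl.value}: {lvl.name.lower()}"
```

Using an `IntEnum` keeps the levels ordered, so comparisons such as `CapabilityLevel.AUTONOMOUS > CapabilityLevel.PREDICTIVE` work directly.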
translated by 谷歌翻译
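Autonomous systems are becoming ubiquitous and gaining momentum within the marine sector. Since the electrification of transport is happening at the same time, autonomous marine vessels can reduce environmental impact, lower costs, and increase efficiency. Although close monitoring is still required to ensure safety, the ultimate goal is full autonomy. One major milestone is the development of a control system that is versatile enough to handle any weather and any encounter while also being robust and reliable. In addition, the control system must adhere to the International Regulations for Preventing Collisions at Sea (COLREGs) in order to interact successfully with human sailors. Because the COLREGs were written for the human mind to interpret, they are phrased in ambiguous prose and are therefore neither machine-readable nor verifiable. Given these challenges and the wide variety of situations to be handled, classical model-based approaches prove complicated to implement and computationally heavy. Within machine learning (ML), deep reinforcement learning (DRL) has shown great potential for a wide range of applications. The model-free and self-learning properties of DRL make it a promising candidate for autonomous vessels. In this work, a subset of the COLREGs is incorporated into a DRL-based path-following and obstacle-avoidance system using collision risk theory. The resulting autonomous agent dynamically interpolates its behavior in the training scenario, in isolated encounter situations, and in AIS-based simulations of real-world scenarios.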
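In this work, we introduce, justify, and demonstrate the Corrective Source Term Approach (CoSTA), a novel approach to Hybrid Analysis and Modeling (HAM). The objective of HAM is to combine physics-based modeling (PBM) and data-driven modeling (DDM) to create generalizable, trustworthy, accurate, computationally efficient, and self-evolving models. CoSTA achieves this objective by augmenting the governing equation of a PBM model with a corrective source term generated by a deep neural network (DNN). In a series of numerical experiments on one-dimensional heat diffusion, CoSTA is found to outperform comparable DDM and PBM models in terms of accuracy, often reducing predictive errors by several orders of magnitude, while also generalizing better than pure DDM. Owing to its flexible but solid theoretical foundation, CoSTA provides a modular framework for leveraging novel developments within both PBM and DDM. Its theoretical foundation also ensures that CoSTA can be used to model any system governed by (deterministic) partial differential equations. Moreover, CoSTA facilitates interpretation of the DNN-generated source term within the context of PBM, which improves the explainability of the DNN. These factors make CoSTA a potential door-opener for data-driven techniques to enter high-stakes applications previously reserved for pure PBM.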
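The idea of adding a corrective source term to a physics-based solver can be illustrated on one-dimensional heat diffusion. The sketch below is not the authors' implementation: it assumes an explicit-Euler finite-difference scheme with fixed boundary values, and the `corrector` argument is a hypothetical stand-in for the trained deep neural network, here any callable that maps the current state to a per-node source correction. The function names are also illustrative.

```python
from typing import Callable, List

Vector = List[float]


def pbm_step(u: Vector, alpha: float, dx: float, dt: float,
             source: Vector) -> Vector:
    """One explicit-Euler step of u_t = alpha * u_xx + source.

    The two end values are held fixed (Dirichlet boundaries, an assumption).
    The scheme is only stable when alpha * dt / dx**2 <= 0.5.
    """
    new = u[:]
    for i in range(1, len(u) - 1):
        laplacian = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / dx ** 2
        new[i] = u[i] + dt * (alpha * laplacian + source[i])
    return new


def costa_step(u: Vector, alpha: float, dx: float, dt: float,
               source: Vector,
               corrector: Callable[[Vector], Vector]) -> Vector:
    """Same physics step, with a corrective term added to the source.

    `corrector` stands in for the learned model (hypothetical interface).
    """
    sigma = corrector(u)
    corrected = [s + c for s, c in zip(source, sigma)]
    return pbm_step(u, alpha, dx, dt, corrected)
```

If the physics-based model is missing a heat source and the corrector returns exactly that missing term, `costa_step` reproduces the step of the full model, while `pbm_step` with the incomplete source does not. In practice the corrector would be learned from data and would only approximate the missing term.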
Advances in reinforcement learning have led to its successful application in complex tasks with continuous state and action spaces. Despite these advances in practice, most theoretical work pertains to finite state and action spaces. We propose building a theoretical understanding of continuous state and action spaces by employing a geometric lens. Central to our work is the idea that the transition dynamics induce a low-dimensional manifold of reachable states embedded in the high-dimensional nominal state space. We prove that, under certain conditions, the dimensionality of this manifold is at most the dimensionality of the action space plus one. This is the first result of its kind, linking the geometry of the state space to the dimensionality of the action space. We empirically corroborate this upper bound for four MuJoCo environments. We further demonstrate the applicability of our result by learning a policy in this low-dimensional representation. To do so, we introduce an algorithm that learns a mapping to a low-dimensional representation, as a narrow hidden layer of a deep neural network, in tandem with the policy using DDPG. Our experiments show that a policy learnt this way performs on par or better for four MuJoCo control suite tasks.
Hierarchical Reinforcement Learning (HRL) algorithms have been demonstrated to perform well on high-dimensional decision-making and robotic control tasks. However, because they optimize solely for rewards, the agent tends to search the same space redundantly. This problem reduces both the speed of learning and the achieved reward. In this work, we present an off-policy HRL algorithm that maximizes entropy for efficient exploration. The algorithm learns a temporally abstracted low-level policy and is able to explore broadly through the addition of entropy to the high level. The novelty of this work is the theoretical motivation for adding entropy to the RL objective in the HRL setting. We empirically show that entropy can be added to both levels if the Kullback-Leibler (KL) divergence between consecutive updates of the low-level policy is sufficiently small. We performed an ablation study to analyze the effects of entropy on the hierarchy, in which adding entropy to the high level emerged as the most desirable configuration. Furthermore, a higher temperature in the low level leads to Q-value overestimation and increases the stochasticity of the environment that the high level operates on, making learning more challenging. Our method, SHIRO, surpasses state-of-the-art performance on a range of simulated robotic control benchmark tasks and requires minimal tuning.
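For orientation, the entropy-augmented objective and the KL condition mentioned above can be written in the standard maximum-entropy RL form. The notation here is an assumption and is not taken from the paper: $\pi$ is a policy, $r$ the reward, $\alpha$ the temperature, $\mathcal{H}$ the entropy, $\pi_{\mathrm{lo}}^{(k)}$ the low-level policy after update $k$, and $\epsilon$ a small threshold.

```latex
% Standard maximum-entropy RL objective (assumed form, not quoted from the paper)
J(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t} r(s_t, a_t)
         + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]

% Condition under which entropy can be added to both levels,
% as stated qualitatively in the abstract (direction of the KL is an assumption)
D_{\mathrm{KL}}\!\left( \pi_{\mathrm{lo}}^{(k)}(\cdot \mid s) \,\big\|\, \pi_{\mathrm{lo}}^{(k+1)}(\cdot \mid s) \right) \le \epsilon
```

A larger $\alpha$ weights exploration more heavily, which is consistent with the abstract's remark that a higher low-level temperature makes the high level's environment more stochastic.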
Instruction tuning enables pretrained language models to perform new tasks from inference-time natural language descriptions. These approaches rely on vast amounts of human supervision in the form of crowdsourced datasets or user interactions. In this work, we introduce Unnatural Instructions: a large dataset of creative and diverse instructions, collected with virtually no human labor. We collect 64,000 examples by prompting a language model with three seed examples of instructions and eliciting a fourth. This set is then expanded by prompting the model to rephrase each instruction, creating a total of approximately 240,000 examples of instructions, inputs, and outputs. Experiments show that, despite containing a fair amount of noise, training on Unnatural Instructions rivals the effectiveness of training on open-source manually curated datasets, surpassing the performance of models such as T0++ and Tk-Instruct across various benchmarks. These results demonstrate the potential of model-generated data as a cost-effective alternative to crowdsourcing for dataset expansion and diversification.
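The collection step described above, showing a model three seed instructions and asking for a fourth, amounts to building a few-shot prompt that ends with an open slot. The sketch below shows one way such a prompt could be assembled. The template text and the function name `build_elicitation_prompt` are hypothetical; the abstract does not specify the actual prompt format, and no model call is made here.

```python
from typing import Sequence


def build_elicitation_prompt(seed_instructions: Sequence[str]) -> str:
    """Format three seed instructions and leave a fourth slot open.

    The "Example N / Instruction:" layout is an illustrative assumption.
    """
    if len(seed_instructions) != 3:
        raise ValueError("expected exactly three seed instructions")
    parts = []
    for i, instruction in enumerate(seed_instructions, start=1):
        parts.append(f"Example {i}\nInstruction: {instruction.strip()}\n")
    # The model is expected to continue the text after this open slot.
    parts.append("Example 4\nInstruction:")
    return "\n".join(parts)
```

The returned string would be sent to a language model, and its completion would be parsed as the new instruction. Sampling different seed triples yields different completions, which is how a dataset can grow from a small seed set.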